Reconstructing fully-resolved trees from triplet cover distances
Abstract
It is a classical result that any finite tree with positively weighted edges, and without vertices of degree 2, is uniquely determined by the weighted path distance between each pair of leaves. Moreover, it is possible for a (small) strict subset L of leaf pairs to suffice for reconstructing the tree and its edge weights, given just the distances between the leaf pairs in L. It is known that any set L with this property for a tree T in which all interior vertices have degree 3 must form a cover for T; that is, for each interior vertex v of T, L must contain a pair of leaves from each pair of the three components of T − v. Here we provide a partial converse of this result by showing that if a set L of leaf pairs forms a cover of a certain type for such a tree T, then T and its edge weights can be uniquely determined from the distances between the pairs of leaves in L. Moreover, there is a polynomial-time algorithm for achieving this reconstruction. The result establishes a special case of a recent question concerning ‘triplet covers’, and is relevant to a problem arising in evolutionary genomics.
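The cover condition in the abstract can be made concrete with a small check. The sketch below is illustrative only and is not the paper's reconstruction algorithm: it tests only the necessary cover condition, not the stronger "cover of a certain type" required by the result. The adjacency-dictionary representation and the names `leaf_components` and `is_cover` are assumptions made for this example.

```python
from itertools import combinations

def leaf_components(adj, v):
    """Leaf sets of the components of T - v, one per neighbour of v."""
    comps = []
    for start in adj[v]:
        seen = {v, start}
        stack = [start]
        leaves = set()
        while stack:
            u = stack.pop()
            if len(adj[u]) == 1:  # degree-1 vertex, i.e. a leaf
                leaves.add(u)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        comps.append(leaves)
    return comps

def is_cover(adj, pairs):
    """True if, for every interior vertex v, each pair of components of
    T - v is joined by at least one leaf pair from `pairs`."""
    pair_set = {frozenset(p) for p in pairs}
    for v, nbrs in adj.items():
        if len(nbrs) == 1:
            continue  # skip leaves; only interior vertices are tested
        for comp_a, comp_b in combinations(leaf_components(adj, v), 2):
            if not any(frozenset((a, b)) in pair_set
                       for a in comp_a for b in comp_b):
                return False
    return True

# Hypothetical example: the four-leaf tree ab|cd with interior vertices u, v.
quartet = {
    "a": ["u"], "b": ["u"], "c": ["v"], "d": ["v"],
    "u": ["a", "b", "v"], "v": ["u", "c", "d"],
}
covering_pairs = [("a", "b"), ("c", "d"), ("a", "c"), ("b", "d")]
non_covering_pairs = [("a", "b"), ("c", "d"), ("a", "c")]
```

In this example `covering_pairs` satisfies the condition at both interior vertices, while `non_covering_pairs` fails at u because no pair links the component {b} to the component {c, d}.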
Similar resources
Comparing and Aggregating Partially Resolved Trees
We define, analyze, and give efficient algorithms for two kinds of distance measures for rooted and unrooted phylogenies. For rooted trees, our measures are based on the topologies the input trees induce on triplets; that is, on three-element subsets of the set of species. For unrooted trees, the measures are based on quartets (four-element subsets). Triplet and quartet-based distances provide ...
Efficient algorithms for computing the triplet and quartet distance between trees of arbitrary degree
The triplet and quartet distances are distance measures to compare two rooted and two unrooted trees, respectively. The leaves of the two trees should have the same set of n labels. The distances are defined by enumerating all subsets of three labels (triplets) and four labels (quartets), respectively, and counting how often the induced topologies in the two input trees are different. In this p...
Computing Refined Buneman Trees in Cubic Time
Reconstructing the evolutionary tree for a set of n species based on pairwise distances between the species is a fundamental problem in bioinformatics. Neighbor joining is a popular distance based tree reconstruction method. It always proposes fully resolved binary trees despite missing evidence in the underlying distance data. Distance based methods based on the theory of Buneman trees and ref...
Combinatorial properties of triplet covers for binary trees
It is a classical result that an unrooted tree T having positive real-valued edge lengths and no vertices of degree two can be reconstructed from the induced distance between each pair of leaves. Moreover, if each non-leaf vertex of T has degree 3 then the number of distance values required is linear in the number of leaves. A canonical candidate for such a set of pairs of leaves in T is the fo...
The mean value of the squared path-difference distance for rooted phylogenetic trees
The path-difference metric is one of the oldest distances for the comparison of fully resolved phylogenetic trees, but its statistical properties are still quite unknown. In this paper we compute the mean value of the square of the path-difference metric between two fully resolved rooted phylogenetic trees with n leaves, under the uniform distribution. This complements previous work by Steel an...